Using a Mixture of N-Best Lists from Multiple MT Systems in Rank-Sum-Based Confidence Measure for MT Outputs
Abstract
This paper addresses the problem of eliminating unsatisfactory outputs from machine translation (MT) systems by using confidence measures. Confidence measures for MT outputs include the rank-sum-based confidence measure (RSCM) for statistical machine translation (SMT) systems. RSCM can be applied to non-SMT systems but does not always work well on them. This paper proposes an alternative RSCM that uses a mixture of the N-best lists from multiple MT systems instead of the single system's N-best list used in the existing RSCM. In most cases, the proposed RSCM worked better than the existing RSCM on two non-SMT systems and as well as the existing RSCM on an SMT system.
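The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of pooling N-best lists from several systems before computing a rank-weighted word score. The weighting (N + 1 - rank), the bag-of-words matching, the averaging into a sentence score, and the function names `rank_sum_confidence` and `sentence_confidence` are all assumptions made for illustration, not the authors' definition of RSCM.

```python
from collections import defaultdict


def rank_sum_confidence(nbest_lists):
    """Score words by a rank-weighted sum over pooled N-best lists.

    nbest_lists: one N-best list per MT system; each list holds hypotheses
    ordered best first, and each hypothesis is a list of word tokens.
    Assumed weighting (not taken from the paper): a hypothesis at rank r in
    a list of size N contributes weight N + 1 - r to every word it contains.
    """
    scores = defaultdict(float)
    total = 0.0
    for nbest in nbest_lists:
        n = len(nbest)
        for rank, hypothesis in enumerate(nbest, start=1):
            weight = n + 1 - rank
            total += weight
            # Count each word once per hypothesis, ignoring position.
            for word in set(hypothesis):
                scores[word] += weight
    if total == 0:
        return {}
    # Normalize so that a word present in every hypothesis scores 1.0.
    return {word: score / total for word, score in scores.items()}


def sentence_confidence(hypothesis, word_scores):
    """Average the word scores of one output (an assumed aggregation)."""
    if not hypothesis:
        return 0.0
    return sum(word_scores.get(w, 0.0) for w in hypothesis) / len(hypothesis)


# Hypothetical toy data: a 2-best list from one system, a 1-best from another.
system_a = [["the", "cat", "sat"], ["a", "cat", "sat"]]
system_b = [["the", "cat", "sits"]]

mixed_scores = rank_sum_confidence([system_a, system_b])   # mixture of lists
single_scores = rank_sum_confidence([system_a])            # single-system list
```

Passing a single list reproduces the single-system setting, while passing several lists gives the mixture setting; an output whose sentence score falls below a chosen threshold would be the one to reject.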
Similar resources
Combining Outputs from Multiple Machine Translation Systems
Currently there are several approaches to machine translation (MT) based on different paradigms; e.g., phrasal, hierarchical and syntax-based. These three approaches yield similar translation accuracy despite using fairly different levels of linguistic knowledge. The availability of such a variety of systems has led to a growing interest toward finding better translations by combining outputs f...
Word Confidence Estimation for SMT N-best List Re-ranking
This paper proposes to use Word Confidence Estimation (WCE) information to improve MT outputs via N-best list reranking. From the confidence label assigned for each word in the MT hypothesis, we add six scores to the baseline loglinear model in order to re-rank the N-best list. Firstly, the correlation between the WCE-based sentence-level scores and the conventional evaluation scores (BLEU, TER...
Removing Biases from Trainable MT Metrics by Using Self-Training
Most trainable machine translation (MT) metrics train their weights on human judgments of state-of-the-art MT systems' outputs. This makes trainable metrics biased in many ways. One of them is preferring longer translations. These biased metrics, when used for tuning, are evaluating different types of translations: n-best lists of translations with very diverse quality. Systems tuned with these m...
Combination of Machine Translation Systems via Hypothesis Selection from Combined N-Best Lists
Different approaches in machine translation achieve similar translation quality with a variety of translations in the output. Recently it has been shown that it is possible to leverage the individual strengths of various systems and improve the overall translation quality by combining translation outputs. In this paper we present a method of hypothesis selection which is relatively simple comp...
Automatic Ranking of Machine Translation Outputs Using Linguistic Factors
Machine translation is a challenging problem in Indian languages. The main goal of MT research is to develop MT systems that consistently provide high-accuracy translations and that have broad coverage to handle the full range of languages. In an age of the Internet and globalization, MT is in great demand. Since MT is an automated system, it is not necessary that the system will p...
Publication date: 2004